Example: The Newton method for finding roots of functions

The Newton method is an iterative method to solve equations of the form $f(x)=0$, i.e. to find roots or zeros $x^\ast$ such that $f(x^\ast) = 0$. Given an initial guess $x_0$, we repeat the iteration

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$
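
This update comes from linearizing $f$ at $x_n$: the tangent line $y = f(x_n) + f'(x_n)(x - x_n)$ crosses the $x$-axis exactly at $x = x_n - f(x_n)/f'(x_n)$, which we take as the next guess.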

Variables

Let's implement the Newton algorithm in Julia. We start from an initial condition $x_0$:


In [ ]:
x_0 = 3

Julia always returns a value from any expression:


In [ ]:
x_0

We can use LaTeX notation and tab completion for Unicode, e.g. x\_0<TAB>:


In [ ]:
x₀ = 3

In [ ]:
x₀

Values in Julia have associated types. We can find the type of a variable using the appropriately-named typeof function:


In [ ]:
typeof(x₀)

We can guess that this means an integer with 64 bits. [This result will be Int32 if you have a 32-bit machine.]

Simple functions

We need to define a function whose roots we wish to find. Let's find square roots of two, for example. Julia provides a concise mathematical syntax for defining simple functions:


In [ ]:
f(x) = x^2 - 2

In [ ]:
f(x_0)

We also need the derivative function, $f'$. For the moment, let's just define it by hand. (Later, we will see a neat way to avoid this.) We might like to call it f', using the apostrophe ('), but the apostrophe turns out to be a special character in Julia, so we get an error if we try to define a variable or function named f':


In [ ]:
f'(x) = 2x

However, Unicode comes to our rescue: f\prime<TAB>:


In [ ]:
f′(x) = 2x

Now we can do one step of our algorithm; mathematical operations work like we expect:


In [ ]:
x_1 = x_0 - f(x_0) / f'(x_0)

Note that division of integers using / gives a floating-point result:


In [ ]:
typeof(x_1)

Iteration

We now need to repeat such steps several times. Julia has for loops and while loops. As usual, we tend to use for loops when we know how many iterations we want, and while loops when we iterate until a certain condition is satisfied.
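
As a quick added illustration of the while syntax, we can count down until a condition fails:


In [ ]:
n = 5
while n > 0    # repeat as long as the condition holds
    println(n)
    n -= 1
end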

Ranges and arrays

Let's start with a simple for. Blocks of code in Julia always end with end:


In [ ]:
for i in 1:5
    println(i)  # print the value of i followed by a new line
end

Here, a variable i is introduced that is local to the loop, i.e. it exists only inside the loop:


In [ ]:
i

i takes each value in the iterable collection 1:5. Let's ask Julia what this object 1:5 is:


In [ ]:
1:5

As usual, Julia returns a value, but in this case it is, at first glance, not very helpful. What type is this object?


In [ ]:
typeof(1:5)

We see that Julia has a special type (actually, several different types) to represent ranges, in which the elements are calculated each time a new element is required, rather than stored. We can see all the elements that will be produced using the collect function:


In [ ]:
v = collect(1:5)

The result is an object of a new type, an Array, in this case one whose elements are integers and whose dimension is 1. Note that the 1 here is the dimension, not the number of elements in the array; the number of elements is given by the length function:


In [ ]:
length(v)

Arrays are also iterable, so we can iterate over an Array using a for loop. 1-dimensional arrays, also called Vectors, are constructed using square brackets:


In [ ]:
w = [3, 4, 7]
for i in w
    println(2*i)
end

Implementing the Newton method

We are now ready to implement the Newton method:


In [ ]:
x_0 = 3
x = x_0

for i in 1:10
    x_new = x - f(x) / f′(x)
    println(i, "\t", x_new)
    x = x_new
end

In this case, we see that the method rapidly converges to one of the square roots of two. Which root it converges to depends on the initial condition:


In [ ]:
x_0 = -3
x = x_0

for i in 1:10
    x_new = x - f(x) / f′(x)
    println(i, "\t", x_new)
    x = x_new
end

The Newton method is, in fact, not guaranteed to converge to a root (although it always does so if started "sufficiently close" to a root, at a rate that is known). Furthermore, which root it converges to can depend sensitively on the initial condition. Let's calculate this for several initial conditions.

First we create a set of initial conditions on the real line, say between -5 and 5. We now include a step size in the range:


In [ ]:
initial_conditions = -5:0.1:5
collect(initial_conditions)   # use tab completion for long variable names!

This range type is different:


In [ ]:
typeof(-5:0.1:5)

The collected array also has a new type: it is now an array of 64-bit floating-point numbers. We can also see that the {...} syntax in the displayed type gives the parameters of the Array type.

For each of these initial conditions, we will run the Newton algorithm for a certain number of steps and store the resulting value. We thus need a new array in which to store the results. One way of creating an array is using the similar function, which, by default, creates an array of the same type and same size, but with (currently) uninitialized values:


In [ ]:
roots = similar(initial_conditions)

Now we do the work:


In [ ]:
for (j, x_0) in enumerate(initial_conditions)
    x = x_0

    for i in 1:100
        x = x - f(x) / f′(x)
    end
    
    roots[j] = x
end

Here, enumerate iterates over initial_conditions but returns not only the value at each step, but also a counter. (j, x_0) is called a tuple (an ordered pair):


In [ ]:
t = (3, 4)
typeof(t)

NB: In Julia v0.4, tuples have been completely reworked, and the resulting type is now Tuple{Int64,Int64}.


In [ ]:
roots

Julia does not show all of the contents of an array by default. We can see everything using showall:


In [ ]:
showall(roots)

We see that, apart from the NaN value, the results are not very exciting. Let's work harder with more initial conditions. We can find out how long the calculation takes using @time by wrapping the code in a begin...end block:


In [ ]:
@time begin 
    initial_conditions = -100:0.01:100
    roots = similar(initial_conditions)

    for (j, x_0) in enumerate(initial_conditions)
        x = x_0

        for i in 1:1000
            x = x - f(x) / f′(x)
        end

        roots[j] = x
    end
end

Packages and visualisation

There are now many values stored in the array, so it is hopeless to examine them:


In [ ]:
length(roots)

Instead, we turn to visualisation. There are several plotting packages in Julia: Gadfly is a native Julia library that produces beautiful plots; PyPlot is a Julian interface to the well-known matplotlib Python library.

Let's start with PyPlot. First we need to download the package. Julia provides a built-in package manager, called Pkg, that gracefully handles dependencies, etc. To tell Julia that we require the package, we do


In [ ]:
Pkg.add("PyPlot")

This step is necessary only once. In each session where we need to use PyPlot we do


In [ ]:
using PyPlot

Note that this process of loading a package currently can take a considerable time. Work is in progress to reduce this loading time.


In [ ]:
figure(figsize=(6,4))
plot(roots);

Performance 1

If we are used to the performance of C or Fortran, we might start to be unhappy with Julia's speed in this rather simple calculation. A close inspection of the output of the @time operation, however, gives us a very important clue: Julia apparently allocated over a gigabyte of memory to do a simple loop with some floating-point numbers!

This is almost always a very strong signal that there is something very wrong in your Julia code! In our case, it is not at all clear what that could be. It turns out to be something very fundamental in Julia:

[almost] NEVER WORK WITH GLOBAL OBJECTS!

Due to technical details about the way that Julia works, it turns out that GLOBALS ARE BAD. What is the solution? PUT EVERYTHING INTO A FUNCTION! Let's try following this advice. We take exactly the same code and just plop it into a new function. For longer functions, Julia has an alternative syntax:


In [ ]:
function do_roots()
    initial_conditions = -100:0.01:100
    roots = similar(initial_conditions)

    for (j, x_0) in enumerate(initial_conditions)
        x = x_0

        for i in 1:1000
            x = x - f(x) / f′(x)
        end

        roots[j] = x
    end
    
    roots
end

Note the last line of the function: the value of the last expression, here the roots object, is automatically returned as the output of the function. So we can call it like this:


In [ ]:
roots = do_roots()

Now how long did it take?


In [ ]:
# a semi-colon suppresses output
@time roots = do_roots();
@time roots = do_roots();

It allocates a million times less memory, and is 50 times faster! This is the first lesson about performance in Julia: always put everything in a function.

Note that the first time we ran the function, it took longer. This is due to the fact that the first time a function is run with arguments of given types, the function is compiled. Subsequent runs with the same types of arguments reuse the previously-compiled code.

Exercise: Use a while loop with a suitable condition to improve the code for the Newton method.
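
One possible sketch of such a condition (an illustrative solution with names chosen here, not necessarily the intended one): keep iterating while |f(x)| exceeds a tolerance, with a cap on the number of iterations:


In [ ]:
function newton_while(x0, tolerance=1e-10, max_iterations=1000)
    x = x0
    iterations = 0
    while abs(f(x)) > tolerance && iterations < max_iterations
        x = x - f(x) / f′(x)
        iterations += 1
    end
    x
end

newton_while(3.0)   # ≈ sqrt(2)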

Generic functions and methods

Our code currently is not very flexible. To make it more flexible, we would like to pass in arguments to the do_roots function. We can make a version which takes as arguments the functions f and f′, for example. Functions are "first-class objects" in Julia, so they can just be passed around by name.

Let's redefine our function do_roots to accept these arguments:


In [ ]:
function do_roots(f, f′)
    initial_conditions = -100:0.01:100
    roots = similar(initial_conditions)

    for (j, x_0) in enumerate(initial_conditions)
        x = x_0

        for i in 1:1000
            x = x - f(x) / f′(x)
        end

        roots[j] = x
    end
    
    roots
end

Note the output that Julia returns: "generic function with 2 methods". This is a sign that something interesting is happening. In fact, we have not "redefined" the function do_roots; rather, we have defined a new version of do_roots, which accepts a different set of arguments. (The collection and types of the arguments that a function accepts are called its type signature.)

Indeed, the function do_roots now has two different methods or versions:


In [ ]:
methods(do_roots)

If we call do_roots with no arguments, the first version will be used; calling it with two arguments will call the second version. The process of choosing which "version" of a function to call is called dispatch. The fundamental fact in Julia is that (almost) all functions are such "generic functions" with multiple versions, i.e. Julia is one of very few languages that use multiple dispatch. This turns out to be very natural for many applications in scientific computing.
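
Arity is only part of the story: methods can also differ in the declared types of their arguments, and Julia picks the most specific method that matches all of them. A small added sketch (the function name describe and the type annotations are chosen just for this illustration):


In [ ]:
describe(x::Int64)   = "a 64-bit integer"
describe(x::Float64) = "a 64-bit floating-point number"
describe(x, y)       = "two arguments of any type"

describe(3), describe(3.5), describe(3, 4)   # on a 64-bit machine, 3 is an Int64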

The arguments f and f′ in the second method of do_roots are names that are local to the function. We have functions of the same name defined globally, so we can pass those in:


In [ ]:
@time do_roots(f, f′);

This is faster than the first version of do_roots, but much slower than the good version. It turns out that Julia currently cannot optimize (inline) functions passed in this way. This is something to bear in mind -- there is (currently) a trade-off between user convenience and speed.

Julia also has anonymous functions, which allow us to pass in a function that we define "in the moment", without giving it a name. For example, let's do the exercise with a more interesting function:


In [ ]:
@time roots = do_roots(x->(x-1)*(x-2)*(x-3), x->3x^2-12x+11);

We see that anonymous functions are currently very slow. However, there are workarounds, e.g. the FastAnonymous package.

Let's visualize the results for this function:


In [ ]:
figure(figsize=(6,4))
plot(-100:0.01:100, roots)
xlim()

In [ ]:
figure(figsize=(6,4))
plot(-100:0.01:100, roots)
xlim(1, 3)

Complexifying Newton

The previous result is still pretty boring. It turns out that the Newton method gets interesting if we look for roots of functions of complex numbers. [If you are not familiar with complex numbers, you can think of them as pairs of real numbers that have certain mathematical operations defined.]

Let's try to use the Newton method starting from initial conditions distributed in the complex plane, i.e. pairs $a + bi$, where $i = \sqrt{-1}$. First of all let's see how Julia handles complex numbers:


In [ ]:
sqrt(-1)

Oh dear, that didn't work very well. It turns out that Julia is carefully designed to respect, when possible, the type of the input argument. Indeed, let's ask Julia what it thinks sqrt means:


In [ ]:
sqrt

We see that sqrt is a generic function, with the following methods:


In [ ]:
methods(sqrt)

Julia gives us a list of the available methods, together with links direct to the source code on GitHub (in IJulia) or locally (in Juno).

sqrt() acting on a Float64 returns a Float64 when it can, or throws a DomainError when its argument is negative. To get square roots in the complex plane, we must start with a complex number.

The names of types in Julia start with capital letters, so let's try Complex:


In [ ]:
Complex

As we will see later, types have functions with the same name that act as constructors to make objects of the type. Let's see the available functions with the name Complex. Note that output has changed rather a lot between Julia v0.3 and Julia v0.4:


In [ ]:
methods(Complex)

Now let's try playing with Complex:


In [ ]:
a = Complex(3)

In [ ]:
typeof(a)

In [ ]:
b = Complex(3, 4.5)

In [ ]:
typeof(b)

We see that Complex is also parametrised by the type of its real and imaginary parts.

We can also make complex numbers directly using im:


In [ ]:
3.0 + 4.0im

(Here, 4.0im is multiplication of 4.0 by im, which represents $i$, the imaginary unit.)
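
With im in hand we can revisit the earlier failure of sqrt(-1): given a complex argument, sqrt returns a complex result (a quick extra check, not in the original notebook):


In [ ]:
sqrt(-1 + 0im)   # gives 0.0 + 1.0im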

We can do complex arithmetic:


In [ ]:
a * b

What is happening here? Julia knows how to do * for complex numbers. Let's ask Julia what * is:


In [ ]:
*

So, mathematical operators are generic functions too! We can list all the ways to do *:


In [ ]:
methods(*)

All of these are defined in Julia itself. (Although the definitions for basic types like Int are only shallow wrappers around underlying C code.) We see that generic functions can be a complicated "patchwork" made of different methods for different types.

We can find the exact method used for a given operation using @which:


In [ ]:
@which a * b

Initial conditions: matrices

We are now ready to think about how to generate a grid of initial conditions of the form $a+bi$ in the complex plane, $\mathbb{C}$. Firstly, we could just iterate over the initial conditions using two nested for loops, e.g.


In [ ]:
for i in -2:1
    for j in -2:1
        println("($i, $j)")
    end
end

Here we have used string interpolation: the value of the variable i is substituted into the string in place of the sequence $i. [Note that this is not recommended for performance-critical applications.]

But we still require somewhere to store the results. It is natural to use a matrix. A simple way of generating a matrix is the zeros function:


In [ ]:
zeros(3)

We see that with a single argument, we generate a vector of zeros, while


In [ ]:
zeros(3, 3)

gives a matrix, i.e. a 2-dimensional Array.

Multiple dispatch allows Julia to provide convenience versions of functions like this. For example:


In [ ]:
zeros(-3:2)

creates a vector of the same length as the range!

However, this does not work for two different ranges:


In [ ]:
zeros(-3:2, -3:2)

We can use length for example:


In [ ]:
linear_initial_conditions = -5:0.1:5
roots = zeros(length(linear_initial_conditions), length(linear_initial_conditions))

However, if we try to store a complex number in this matrix, we find a problem:


In [ ]:
roots[1, 1] = 3+4im

An InexactError is a sign that we are trying to put a value into a type that it "doesn't fit into", for example a Float64 into an Int64, or, in this case, a complex number into a float. We must instead create the matrix to hold complex numbers:


In [ ]:
linear_initial_conditions = -5:0.1:5
roots = zeros(Complex128, length(linear_initial_conditions), length(linear_initial_conditions))

Here, Complex128 is just an alias (an alternative name) for Complex{Float64}, so called because two 64-bit Float64s require 128 bits of storage in total.
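
We can verify this directly (a quick added check, valid in the Julia versions this notebook targets, where Complex128 is defined):


In [ ]:
Complex128 === Complex{Float64}   # true: the two names refer to the same type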

Now we can insert complex values into the matrix:


In [ ]:
roots[1, 1] = 3+4im

In [ ]:
roots

Implementing Newton for complex functions

We are now ready to make a version of Newton for complex functions. We will try to find cube roots of $1$ in the complex plane, by finding zeros of the function


In [ ]:
f(z) = z^3 - 1

with derivative


In [ ]:
f′(z) = 3z^2

In [ ]:
function do_complex_roots(range=-5:0.1:5)  # default value

    L = length(range)
    
    roots = zeros(Complex128, L, L)

    for (i, x) in enumerate(range)
        for (j, y) in enumerate(range)
            
            z = x + y*im
            
            for k in 1:1000
                z = z - f(z) / f′(z)
            end

            roots[i,j] = z
        end
        
    end
    
    roots
end

In [ ]:
roots = do_complex_roots(-5:0.1:5)

Now let's use PyPlot to plot the result. PyPlot only understands floating-point matrices, so we'll take the imaginary part:


In [ ]:
using PyPlot

In [ ]:
imshow(imag(roots))

Julia uses "column-major" storage, whereas Python uses "row-major", so in fact we need to flip $x$ and $y$:


In [ ]:
function do_complex_roots(range=-5:0.1:5)  # default value

    L = length(range)
    
    roots = zeros(Complex128, L, L)

    for (i, x) in enumerate(range)
        for (j, y) in enumerate(range)
            
            z = y + x*im
            
            for k in 1:1000
                z = z - f(z) / f′(z)
            end

            roots[i,j] = z
        end
        
    end
    
    roots
end

In [ ]:
imshow(imag(do_complex_roots(-3:0.01:3)))
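
As an aside (an added example), we can see what "column-major" means by flattening a small matrix: the elements come out column by column:


In [ ]:
A = [1 2; 3 4]   # a 2×2 matrix
A[:]             # memory order: first column, then second → [1, 3, 2, 4]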

Array comprehensions

Julia has a neat syntax for constructing arrays from iterables that is very similar to mathematical notation. For example, the set of squares of the numbers from 1 to 10 is

$$\{x^2: x \in \{1,\ldots,10\} \},$$

i.e. "the set of $x^2$ for $x$ from $1$ to $10$. In Julia we can write


In [ ]:
squares = [x^2 for x in 1:10]
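
Comprehensions can also run over two iterables at once, producing a 2-dimensional array; we will use this below for the fractal. A small added example (the variable name is chosen just for illustration):


In [ ]:
multiplication_table = [i * j for i in 1:3, j in 1:4]   # a 3×4 matrix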

Let's define a Newton function by


In [ ]:
function newton(x0, N=100)
    x = x0
    
    for i in 1:N
        x = x - f(x) / f′(x)
    end
    
    x
end

Before using it, let's check the methods of newton:


In [ ]:
methods(newton)

Note that the effect of a default argument is simply to create an additional method. Our Newton fractal can then be written very concisely:


In [ ]:
function newton_fractal(range)
    [newton(b+a*im) for a in range, b in range]
end

We can add labels using PyPlot's text function; let's check its documentation:


In [ ]:
?text

In [ ]:
imshow(imag(newton_fractal(-3:0.01:3)), extent=(-3, 3, -3, 3))

text(1, 0, L"1")
text(reim(exp(2π*im/3))..., L"e^{2\pi i/3}")
text(reim(exp(-2π*im/3))..., L"e^{-2\pi i/3}")

Here, we have used Julia's reim function:


In [ ]:
reim(exp(2π*im/3))

It returns a tuple. The ..., or splat, operator unpacks the tuple into two separate arguments. The L"..." notation is a special string macro from the LaTeXStrings package used by PyPlot, which creates a LaTeX string.
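
As a small added illustration of splatting (hypot is just a convenient two-argument function to unpack the tuple into):


In [ ]:
coords = reim(exp(2π*im/3))   # a tuple: (real part, imaginary part)
hypot(coords...)              # same as hypot(coords[1], coords[2]); ≈ 1.0 on the unit circle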

Exercise: Make a version that accepts functions and experiment with other complex polynomials.
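
One possible starting point for this exercise (a sketch with names chosen here, not the only approach):


In [ ]:
function newton(f, f′, x0, N=100)
    x = x0
    for i in 1:N
        x = x - f(x) / f′(x)
    end
    x
end

newton_fractal(f, f′, range) = [newton(f, f′, b + a*im) for a in range, b in range]

# NB: as noted earlier, anonymous functions may be slow in older Julia versions
imshow(imag(newton_fractal(z -> z^4 - 1, z -> 4z^3, -3:0.01:3)))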

Introspection and iteration protocol

How does Julia know how to iterate using for through a vector or range? Let's look at a Unicode string:


In [ ]:
s = "aαbβ"  # use `\alpha<TAB>`

In [ ]:
typeof(s)

Julia provides access to several layers between the high-level code we write and the low-level machine code that is finally produced by the compilation process. The first of these is a "lowered" version of the code, in which high-level syntax is transformed into lower-level Julia code. Let's look at the lowered form of a function containing a for loop, for example our do_roots function:


In [ ]:
@code_lowered do_roots()

We see that the loop is expressed in terms of three important functions: start, next and done. Iterating through a Unicode UTF8String is more complicated than iterating through a range, since characters may occupy different numbers of bytes:


In [ ]:
s[1]

In [ ]:
s[2]

In [ ]:
s[3]

Nonetheless, we can iterate through s:


In [ ]:
function string_iterate(s)
    for c in s
        println(c)
    end
end

In [ ]:
string_iterate(s)

For example, we can extract a list of the characters in s with


In [ ]:
chars = [c for c in s]

In [ ]:
chars[1]

In [ ]:
typeof(chars[1])

Note that in Julia, strings are written with " and characters with ' (as in C).

The interface that allows us to iterate over an object using for is provided by three functions start, next and done that must be defined for that type:


In [ ]:
start(s)

In [ ]:
next(s, 1)

In [ ]:
next(s, 2)

In [ ]:
done(s, 2)

In [ ]:
@which start(s)

For more details about introspection, check out Leah Hanson's blog post.


In [ ]:
@which next(s, 1)